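While end-to-end learning with fully differentiable models has enabled tremendous success in natural language processing (NLP) and machine learning, there has been significant recent interest in learning with latent discrete structures, which can incorporate better inductive biases for improved end-task performance and better interpretability. This paradigm, however, is not straightforwardly amenable to mainstream gradient-based optimization methods. This work surveys three main families of methods for learning such models: surrogate gradients, continuous relaxation, and marginal likelihood maximization via sampling. We conclude with a review of applications of these methods and an inspection of the learned latent structures that they induce.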
translated by Google Translate
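As an illustration of the "continuous relaxation" family mentioned in the abstract above, the following is a minimal sketch of the Gumbel-softmax relaxation of a categorical sample. The abstract does not say which relaxations the survey covers, so the choice of Gumbel-softmax, the function name `gumbel_softmax`, and its parameters are assumptions made only for this example.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Return one relaxed (soft) one-hot sample for the given logits.

    Hypothetical helper for illustration; not taken from the surveyed work.
    """
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise by inverse transform: g = -log(-log(u)).
        u = max(rng.random(), 1e-12)  # avoid log(0)
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1.0, 0.5, -0.5]
soft = gumbel_softmax(logits, temperature=5.0, rng=random.Random(0))
sharp = gumbel_softmax(logits, temperature=0.1, rng=random.Random(0))
```

With the same random seed, both calls perturb the logits with identical noise, so they agree on the most likely category. The low-temperature sample is closer to a discrete one-hot vector, while the high-temperature sample is smoother and gives gradients to every category.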
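For natural language processing systems, two kinds of evidence support the use of text representations from neural language models "pretrained" on large unannotated corpora: performance on application-inspired benchmarks (Peters et al., 2018, inter alia), and the emergence of syntactic abstractions in those representations (Tenney et al., 2019, inter alia). On the other hand, the lack of grounded supervision calls into question how well these representations can ever capture meaning (Bender and Koller, 2020). We apply novel probes to recent language models, focusing specifically on predicate-argument structure as operationalized by semantic dependencies (Ivanova et al., 2012), and find that, unlike syntax, semantics is not brought to the surface by today's pretrained models. We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning, yielding benefits on natural language understanding (NLU) tasks in the GLUE benchmark. This approach demonstrates the potential of general-purpose (rather than task-specific) linguistic supervision, above and beyond conventional pretraining and finetuning. Several diagnostics help to localize the benefits of our approach.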
Accurate determination of a small molecule candidate (ligand) binding pose in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the pocket flexibility of protein, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening, requiring neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software, and more importantly achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.
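In narrow spaces, motion planning based on the traditional hierarchical autonomous system can cause collisions due to mapping, localization, and control noise. Moreover, it does not work at all when no map is available. To tackle these problems, we leverage deep reinforcement learning, which has been shown to be effective for autonomous decision-making, to self-explore narrow spaces without a map while avoiding collisions. Specifically, based on our Ackermann-steering, rectangular-shaped ZebraT robot and its Gazebo simulator, we propose a rectangular safety region to represent states and detect collisions for rectangular-shaped robots, together with a carefully crafted reward function for reinforcement learning that does not require destination information. We then benchmark five reinforcement learning algorithms (DDPG, DQN, SAC, PPO, and PPO-discrete) on a simulated narrow track. After training, the well-performing DDPG and DQN models can be transferred to three brand-new simulated tracks, and then to three real-world tracks.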
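Thanks to the rapid development of robotic technologies, robotic mowing is emerging to free humans from tedious and time-consuming landscape work. Traditionally, robotic mowing is treated as a "Coverage Path Planning" problem, with a simplification that converts non-convex obstacles into convex ones. In addition, the converted obstacles are commonly dilated by the robot's circumscribed circle for collision avoidance. However, obstacles on a lawn are usually non-convex (imagine a garden on the lawn), so these obstacle-processing methods fill in some concave areas, which the robot can then no longer reach. This produces unavoidable uncut areas along the lawn edge, which spoils the elegance of the landscape and forces rework. To shrink the uncut area around the lawn edge, we reframe the task as a brand-new problem, the "Edge Coverage Path Planning" problem, which is dedicated to path planning with the objective of covering the edge. Correspondingly, we propose two planning methods, the "big and small disk" method and the "sliding chopstick" method, which tackle the problem by leveraging image morphological processing and computational geometry. In validation, our proposed methods outperform the traditional "dilation-by-circumcircle" method.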
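Many real-world ubiquitous applications, such as parking recommendation and air pollution monitoring, benefit significantly from accurate long-term spatio-temporal forecasting (LSTF). LSTF makes use of long-term dependencies between the spatial and temporal domains, contextual information, and inherent patterns in the data. Recent studies have revealed the potential of multi-graph neural networks (MGNNs) to improve prediction performance. However, existing MGNN methods cannot be directly applied to LSTF because of several issues: a low level of generality, insufficient use of contextual information, and an imbalanced graph fusion approach. To address these issues, we construct new graph models to represent the contextual information of each node and the long-term spatio-temporal data dependency structure. To fuse information across multiple graphs, we propose a new dynamic multi-graph fusion module that characterizes the correlations of nodes within a graph and of nodes across graphs via spatial attention and graph attention mechanisms. Furthermore, we introduce a trainable weight tensor to indicate the importance of each node in the different graphs. Extensive experiments on two large-scale datasets demonstrate that our proposed approaches significantly improve the performance of existing graph neural network models on LSTF prediction tasks.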
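Drug resistance is a major threat to global health and a significant concern throughout the clinical treatment of diseases and drug development. Mutations in proteins that are involved in drug binding are a common cause of adaptive drug resistance. Quantitative estimates of how mutations affect the interaction between a drug and its target protein are therefore of vital importance for drug development and clinical practice. Computational methods that rely on molecular dynamics simulations, Rosetta protocols, and machine learning have been shown to be capable of predicting ligand affinity changes upon protein mutation. However, severely limited sample sizes and heavy noise lead to overfitting and generalization problems, which have impeded the wide adoption of machine learning for studying drug resistance. In this paper, we propose a robust machine learning method, termed SPLDExtraTrees, which can accurately predict ligand binding affinity changes upon protein mutation and identify resistance-causing mutations. In particular, the proposed method ranks training data following a specific scheme that starts with easy-to-learn samples and gradually incorporates harder and more diverse samples into training, and then iterates between sample weight recalculation and model updates. In addition, we calculate additional physics-based structural features to provide the machine learning model with valuable domain knowledge about proteins for this data-limited prediction task. The experiments substantiate the capability of the proposed method for predicting kinase inhibitor resistance under three scenarios, and it achieves predictive accuracy comparable to that of molecular dynamics and Rosetta methods at much lower computational cost.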
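The alternation between sample weight recalculation and model updates described in the abstract above can be sketched with plain self-paced learning. This is not the SPLDExtraTrees procedure: the sketch swaps the tree ensemble for a weighted mean, uses hard 0/1 weights, and omits the diversity term. The function name `self_paced_mean`, its parameters, and the toy data are assumptions made only for illustration.

```python
import statistics


def self_paced_mean(targets, threshold=0.01, growth=2.0, rounds=4):
    """Estimate a location parameter with self-paced sample selection.

    Hypothetical toy model: the "model" is a single number, the weighted mean.
    """
    estimate = statistics.median(targets)  # initial model
    weights = [0] * len(targets)
    for _ in range(rounds):
        # Step 1: recompute sample weights. A sample is "easy" (weight 1)
        # when its squared error under the current model is small enough.
        losses = [(y - estimate) ** 2 for y in targets]
        weights = [1 if loss <= threshold else 0 for loss in losses]
        # Step 2: update the model using only the selected samples.
        selected = [y for y, w in zip(targets, weights) if w]
        if selected:
            estimate = sum(selected) / len(selected)
        # Step 3: relax the threshold so harder samples can enter later rounds.
        threshold *= growth
    return estimate, weights


# Toy data: five measurements near 1.0 and one heavy outlier.
estimate, weights = self_paced_mean([1.0, 1.1, 0.9, 1.05, 0.95, 8.0])
```

On this toy data the outlier is never admitted because its loss stays far above the growing threshold, so the final estimate is driven by the five consistent measurements. In a tree-ensemble setting, the same 0/1 weights could be passed as per-sample weights when refitting the model.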
A recent study has shown a phenomenon called neural collapse in that the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to discrimination for the minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure in imbalanced semantic segmentation. Experimental results show that our method can bring significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
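To make the "simplex equiangular tight frame" in the abstract above concrete, the sketch below builds one standard construction for K classes, m_k = sqrt(K/(K-1)) (e_k - (1/K) 1), and exposes the two properties that define it: unit-norm vectors and a pairwise cosine of -1/(K-1). The function names are assumptions for this example, and the code is not taken from the paper's repository.

```python
import math


def simplex_etf(num_classes):
    """Return num_classes vectors forming a simplex equiangular tight frame."""
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    # Row i is the centered one-hot vector e_i - (1/k) * ones, rescaled to unit norm.
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


frame = simplex_etf(4)
norms = [math.sqrt(dot(v, v)) for v in frame]
cosines = [dot(frame[i], frame[j]) for i in range(4) for j in range(i + 1, 4)]
```

For K = 4 every vector has norm 1 and every pair has cosine -1/3, the most negative value that K unit vectors can share pairwise. This is the "equiangular and maximally separated" structure that the abstract says is broken by contextual correlation and class imbalance in semantic segmentation.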
Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only image-level labels. Most existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map to perceive the whole object, yet they ignore the co-occurrence confounder of the object and its context (e.g., fish and water), which makes it hard for the model to distinguish object boundaries. Besides, the use of CAM also creates a dilemma: classification and localization always suffer from a performance gap and cannot reach their highest accuracy simultaneously. In this paper, we propose a causal knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and in addressing the dilemma between classification and localization performance.
Witnessing the impressive achievements of pre-training techniques on large-scale data in the fields of computer vision and natural language processing, we wonder whether this idea could be adapted in a grab-and-go spirit to mitigate the sample inefficiency problem in visuomotor driving. Given the highly dynamic and variant nature of the input, the visuomotor driving task inherently lacks view and translation invariance, and the visual input contains massive amounts of information irrelevant to decision making, which makes predominant pre-training approaches from general vision less suitable for the autonomous driving task. To this end, we propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for policy pre-training in visuomotor driving. We aim to learn policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos. The proposed PPGeo is performed in two stages to support effective self-supervised training. In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input. In the second stage, the visual encoder learns a driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on the current visual observation only. As such, the pre-trained visual encoder is equipped with rich driving-policy-related representations and is thereby competent for multiple visuomotor driving tasks. Extensive experiments covering a wide span of challenging scenarios demonstrate the superiority of our proposed approach, with improvements ranging from 2% to even over 100% with very limited data. Code and models will be available at https://github.com/OpenDriveLab/PPGeo.